88 research outputs found

    Learning midlevel image features for natural scene and texture classification

    This paper deals with the coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the representation space for faster computation. The proposed approach is tested on texture classification (111 classes) as well as natural scene classification (11 categories, 2037 images). Under a common protocol, the other commonly used descriptors reach at most 47.7% accuracy on average, while our method reaches up to 63.8%. We show that this advantage does not depend on the size of the signature and demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensio

    Using natural versus artificial stimuli to perform calibration for 3D gaze tracking

    This study tests which type of stereoscopic image, natural or artificial, is better suited to efficient and reliable calibration for tracking observers' gaze in 3D space with a classical 2D eye tracker. We measured horizontal disparity, i.e. the difference between the x coordinates of the two eyes as reported by the 2D eye tracker. This disparity was recorded for each observer and for each of several target positions the observer had to fixate. Target positions were equally distributed in 3D space: some on the screen (null disparity), some behind the screen (uncrossed disparity), and others in front of the screen (crossed disparity). We tested different regression models (linear and nonlinear) to explain either the true disparity or the depth from the measured disparity. Models were compared on their prediction error for new targets at new positions. First, we obtained more reliable disparity measures with natural stereoscopic images than with artificial ones. Second, a nonlinear model was more efficient overall. Finally, we discuss the fact that our results were observer dependent, with variability between observers in their behavior when looking at 3D stimuli. Because of this variability, we propose computing an observer-specific model to accurately predict gaze position during the exploration of 3D stimuli.
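
The model comparison described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the data are synthetic values invented for the illustration, a quadratic polynomial stands in for the unspecified nonlinear model, and every function name is hypothetical. Because the synthetic data are generated with curvature, the quadratic fit wins here by construction; the sketch only shows the procedure of fitting on some targets and scoring on held-out ones.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (c0..c_degree)."""
    k = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve_linear_system(ata, aty)


def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def rmse(coeffs, xs, ys):
    """Root-mean-square prediction error on a set of targets."""
    squared = [(predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)]
    return (sum(squared) / len(xs)) ** 0.5


def synthetic_true_disparity(measured):
    # ASSUMPTION: an invented, mildly curved relation, not taken from the study.
    return 0.1 + 0.8 * measured + 0.05 * measured * measured


# Calibration targets (used for fitting) and new targets (used for scoring).
train_x = [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [synthetic_true_disparity(x) for x in train_x]
test_x = [-3.5, -0.5, 2.5]
test_y = [synthetic_true_disparity(x) for x in test_x]

linear_model = fit_polynomial(train_x, train_y, 1)
quadratic_model = fit_polynomial(train_x, train_y, 2)
linear_error = rmse(linear_model, test_x, test_y)
quadratic_error = rmse(quadratic_model, test_x, test_y)
```

An observer-specific model, as proposed in the abstract, would amount to running this fit once per observer on that observer's own calibration data.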

    Image retrieval : a first step for a human centered approach

    Image indexing based on content analysis is known to be a difficult task that draws on vision research. Using such tools in a retrieval system is generally frustrating for users, because interfaces are underdeveloped and because the low-level features managed by the system are hard for users to understand. In this paper we propose a general approach for linking such systems to their potential users. It includes image features based on visual perception models, a relevance feedback model, and a graphical interface for expressing the information need through user-system interaction.

    Statistical modeling of the influence of a visual distractor on the following eye-fixations

    We examined the influence of a visual distractor appearing during a fixation on the following fixations during natural exploration. It is known that new objects, congruent or incongruent with the scene, appearing during a fixation are fixated more often than expected by chance [Brockmole, J. R., & Henderson, J. M. (2008). Prioritizing new objects for eye fixation in real-world scenes: Effects of object-scene consistency. Vis. Cog., 16(2-3), 375-390]. In this study, we replicated this result using a Gabor patch as the appearing object, called a distractor because it was artificial and unrelated to the scenes. We also wanted to quantify its influence on the exploration. A statistical model of the fixation density function was designed to analyze how the exploration was disrupted at and after the onset of the distractor. The model was a linear weighted combination of maps modeling three independent factors influencing gaze positions. We asked whether the observed fixation locations were due to the distractor or to the saliency of the scenes. As expected, at the beginning of the exploration, fixation locations were not randomly chosen but influenced by the saliency of the scene and by the distractor. The distractor onset strongly influenced fixations, and this influence decreased with time.

    Model of Cortical Cell Processing to Estimate Binocular Disparity

    Starting from physiological and psychophysical studies of 3D vision, we attempt to build a model of stereoscopic vision. We used 2D Gabor filters to model the simple and complex cells sensitive to horizontal binocular disparity (Barlow 1967, Daugman 1985). Each of these cells has a preferred disparity and is sensitive to spatial frequency and orientation. Prince et al (2002) showed that the range of preferred disparities depends on the spatial frequency. We designed a bank of filters in which the distribution of preferred disparities follows the same principle. Moreover, since the stereo-threshold was found to increase with the magnitude of disparity within each spatial frequency channel, the disparity distribution is not uniform. We took the energy model of Ohzawa et al (1986) as a basis, since it has been shown to fit the responses of disparity-sensitive V1 cells to most stimuli. We modified the classical model by normalizing the complex binocular response by the monocular complex response. We took several measures to reduce false matches, such as a pooling procedure and the orientation averaging already used by Chen and Qian (2004). As already demonstrated for 2D vision, a coarse-to-fine process seems to be the best way to deal with multiple spatial frequency channels in stereoscopic vision (Smallman 1995, Menz and Freeman 2003). The first estimate, based on low spatial frequencies, is refined in a higher spatial frequency channel only if it falls within the disparity range of that channel.
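
The energy-model idea in this abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' model: it is 1D rather than 2D, uses a single spatial frequency channel with no coarse-to-fine stage, wraps the signal around at the borders, and normalizes the pooled binocular energy by the pooled monocular energies, which is only one possible reading of the normalization the abstract mentions. All names and parameter values are hypothetical. Disparity is defined here as the shift of the right image relative to the left.

```python
import math
import random


def gabor_pair(sigma, freq, half_width):
    """Even (cosine) and odd (sine) 1D Gabor kernels forming a quadrature pair."""
    even, odd = [], []
    for k in range(-half_width, half_width + 1):
        envelope = math.exp(-(k * k) / (2.0 * sigma * sigma))
        even.append(envelope * math.cos(2.0 * math.pi * freq * k))
        odd.append(envelope * math.sin(2.0 * math.pi * freq * k))
    return even, odd


def filter_circular(signal, kernel):
    """Correlate a 1D signal with a kernel, wrapping around at the borders."""
    n, half = len(signal), len(kernel) // 2
    return [sum(signal[(c + k - half) % n] * w for k, w in enumerate(kernel))
            for c in range(n)]


def estimate_disparity(left, right, candidates,
                       sigma=4.0, freq=0.125, half_width=12):
    """Return the candidate disparity with the highest normalized energy."""
    even, odd = gabor_pair(sigma, freq, half_width)
    l_even, l_odd = filter_circular(left, even), filter_circular(left, odd)
    r_even, r_odd = filter_circular(right, even), filter_circular(right, odd)
    n = len(left)
    best, best_score = None, -1.0
    for d in candidates:
        binocular = 0.0
        monocular = 0.0
        for x in range(n):  # pooling over positions to reduce false matches
            xr = (x + d) % n  # right receptive field shifted by d
            binocular += ((l_even[x] + r_even[xr]) ** 2
                          + (l_odd[x] + r_odd[xr]) ** 2)
            monocular += (l_even[x] ** 2 + l_odd[x] ** 2
                          + r_even[xr] ** 2 + r_odd[xr] ** 2)
        score = binocular / monocular
        if score > best_score:
            best, best_score = d, score
    return best


# Synthetic stereo pair: the right image is the left one shifted by 2 samples.
random.seed(0)
left_image = [random.gauss(0.0, 1.0) for _ in range(128)]
true_disparity = 2
right_image = [left_image[(x - true_disparity) % 128] for x in range(128)]
estimated = estimate_disparity(left_image, right_image, range(-3, 4))
```

The candidate range is kept below half the filter period (8 samples at `freq=0.125`), which mirrors the abstract's point that the usable disparity range depends on the spatial frequency of the channel.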

    Estimation of overlapped Eye Fixation Related Potentials: The General Linear Model, a more flexible framework than the ADJAR algorithm

    The Eye Fixation Related Potential (EFRP) is classically estimated by averaging EEG signals across epochs time-locked to fixation onset. Its main limitation is the overlapping issue. Inter Fixation Intervals (IFI), typically around 300 ms in the case of unrestricted eye movement, depend on participants’ oculomotor patterns and can be shorter than the latency of the components of the evoked potential. If the duration of an epoch is longer than the IFI value, more than one fixation can occur, and some overlapping between adjacent neural responses ensues. The classical average takes into account neither the presence of several fixations during an epoch nor the overlap. To address the overlapping issue, the Adjacent Response algorithm (ADJAR), which is popular for event-related potential estimation, was compared to the General Linear Model (GLM) on a real dataset from a joint EEG and eye-tracking experiment. The results showed that the ADJAR algorithm was based on assumptions that were too restrictive for EFRP estimation. The General Linear Model appeared to be more robust and efficient. Different configurations of this model were compared to estimate the potential elicited at image onset, as well as the EFRP at the beginning of exploration. These configurations took into account the overlap between the event-related potential at stimulus presentation and the following EFRP, and the distinction between the potential elicited by the first fixation onset and subsequent ones. The choice of the General Linear Model configuration was a tradeoff between assumptions about expected behavior and the quality of the EFRP estimation: the number of different potentials estimated by a given model must be controlled to avoid erroneous estimations with large variances.
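
The contrast between the classical average and a GLM can be illustrated with a small Python sketch. It rests on assumptions that are not in the abstract: a single event type, a noise-free synthetic signal, an invented response waveform, and invented fixation onsets; all names are hypothetical. The design matrix has one column per time lag after a fixation onset, so overlapping responses are modeled as a sum and separated by least squares.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def glm_estimate(signal, onsets, n_lags):
    """Least-squares estimate of the response, one coefficient per lag."""
    onset_set = set(onsets)
    # Design matrix: rows[t][lag] is 1 when a fixation started at t - lag.
    rows = [[1.0 if (t - lag) in onset_set else 0.0 for lag in range(n_lags)]
            for t in range(len(signal))]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n_lags)]
           for i in range(n_lags)]
    xty = [sum(r[i] * y for r, y in zip(rows, signal)) for i in range(n_lags)]
    return solve_linear_system(xtx, xty)


def epoch_average(signal, onsets, n_lags):
    """Classical estimate: average of the epochs time-locked to each onset."""
    return [sum(signal[o + lag] for o in onsets) / len(onsets)
            for lag in range(n_lags)]


# ASSUMPTION: invented waveform and onsets; intervals (3 to 6 samples) are
# mostly shorter than the 6-sample response, so adjacent responses overlap.
true_response = [0.0, 1.0, 3.0, 2.0, 1.0, 0.5]
fixation_onsets = [2, 5, 9, 12, 17, 20, 26, 29, 33]
eeg = [0.0] * 40
for onset in fixation_onsets:
    for lag, value in enumerate(true_response):
        eeg[onset + lag] += value

glm_response = glm_estimate(eeg, fixation_onsets, len(true_response))
avg_response = epoch_average(eeg, fixation_onsets, len(true_response))
glm_error = max(abs(a - b) for a, b in zip(glm_response, true_response))
avg_error = max(abs(a - b) for a, b in zip(avg_response, true_response))
```

Distinguishing the potential at image onset or at the first fixation, as the abstract describes, would correspond to adding a separate block of lag columns for each event type; each added block increases the number of coefficients to estimate, which is the tradeoff the abstract points out.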

    How a distractor influences fixations during the exploration of natural scenes

    The distractor effect is a well-established means of studying different aspects of fixation programming during the exploration of visual scenes. In this study, we present a task-irrelevant distractor to participants during the free exploration of natural scenes. We investigate the control and programming of fixations by analyzing fixation durations and locations, and the link between the two. We also propose a simple mixture model, fitted with the Expectation-Maximization algorithm, to test the distractor effect on fixation locations, including fixations which did not land on the distractor. The model allows us to quantify the influence of a visual distractor on fixation location relative to scene saliency for all fixations, at distractor onset and during all subsequent exploration. The distractor effect is not limited to the current fixation; it continues to influence fixations during subsequent exploration. An abrupt change in the stimulus not only increases the duration of the current fixation, it also influences the location of the fixation which occurs immediately afterwards and, to some extent and depending on the length of the change, the duration and location of any subsequent fixations. Overall, results from the eye movement analysis and the statistical model suggest that fixation durations and locations are both controlled by direct and indirect mechanisms.
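
A mixture model of this kind can be sketched in Python as follows. This is an illustration under assumptions, not the authors' model: it uses only two components (a saliency map and a distractor map) with fixed, known densities on a tiny one-dimensional strip of cells, and estimates only the mixture weights; the maps, the fixation list, and all names are invented for the example.

```python
def em_mixture_weights(component_probs, n_iter=200):
    """Estimate mixture weights by EM when component densities are fixed.

    component_probs[i][k] is the density of component k at fixation i.
    """
    k = len(component_probs[0])
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        totals = [0.0] * k
        for probs in component_probs:
            # E-step: responsibility of each component for this fixation.
            joint = [w * p for w, p in zip(weights, probs)]
            norm = sum(joint)
            for j in range(k):
                totals[j] += joint[j] / norm
        # M-step: new weights are the mean responsibilities.
        weights = [t / len(component_probs) for t in totals]
    return weights


# ASSUMPTION: invented maps over a strip of 10 cells, each summing to 1.
saliency_map = [0.3, 0.3, 0.1, 0.1, 0.05, 0.05, 0.025, 0.025, 0.025, 0.025]
distractor_map = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.8, 0.1, 0.0]

# ASSUMPTION: invented fixated cells, six near the distractor, four elsewhere.
fixated_cells = [7, 7, 7, 7, 8, 6, 0, 1, 0, 1]

densities = [[saliency_map[c], distractor_map[c]] for c in fixated_cells]
saliency_weight, distractor_weight = em_mixture_weights(densities)
```

Fitting the weights separately for fixations grouped by their rank after distractor onset would give a rough picture of how the distractor's influence changes over the exploration, which is the kind of quantity the abstract describes.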

    Analyse de séquences oculométriques et d'électroencéphalogrammes par modèles markoviens cachés

    This work aims to analyse sequences of eye movements. These sequences were measured during reading tasks in which information was acquired in order to make decisions. Their analysis, based on hidden semi-Markov chains, highlights different phases of acquisition, which can be related to particular characteristics of multichannel electroencephalograms recorded synchronously with eye movements during the reading tasks. This analysis reveals changes associated with the different phases of information acquisition, occurring in variances and correlations between channels at specific frequencies.

    Hidden semi-Markov models to segment reading phases from eye movements

    Our objective is to analyze scanpaths recorded from participants performing a reading task whose goal is to answer a binary question: is the text related or not to a given target topic? We propose a data-driven method based on hidden semi-Markov chains to segment scanpaths into phases deduced from the model states, which are shown to represent different cognitive strategies: normal reading, fast reading, information search, and slow confirmation. These phases were confirmed using different external covariates, including semantic information extracted from the texts. The analyses highlighted a strong preference of specific participants for specific strategies and, more globally, large individual variability in eye-movement characteristics, as accounted for by random effects. As a perspective, we discuss the possibility of improving reading models by accounting for possible sources of heterogeneity during reading.

    The influence of the visualization task on the Simulator Sickness symptoms - a comparative SSQ study on 3DTV and 3D immersive glasses

    Human factors are an essential aspect to consider in order to explain the level of public acceptability of new stereoscopic devices. A study using the Simulator Sickness Questionnaire allowed us to illustrate the differences in symptoms after viewing 3D images on a 3DTV screen and on a pair of prototype immersive 3D glasses. The results of our study also showed that the visualization task influenced the exploration of the scenes, and therefore influenced the evolution of the simulator sickness symptoms.